FIFA 21 is a video game based on professional soccer players. It has several game modes, of which Ultimate Team is the most played. The goal of Ultimate Team is to build the best possible team of players and to compete against other gamers online. To obtain players, gamers can purchase packs, which give random players, or they can bid on or buy players directly from the market with coins, which are either earned by playing matches against other online players or purchased. Ground players and goalkeepers are graded by Muller-Mohring, who gives each player’s skills and attributes a whole number from 1 to 99 based on their performance in club and international soccer games. The market allows gamers to trade players for coins, and a key question for gamers is “How much should I spend on players in the market, and on what basis can I compare them?” In this project, we wish to find the factors that are correlated with prices for goalkeepers, defenders, midfielders, and attackers in the market.
A gamer lists an 84-overall-rated goalkeeper on the marketplace with a starting bid of 700 coins and a buy-now price of 10,000 coins. The selling price was 4,500 coins.
We can divide all players into two categories: goalkeepers and ground players (defenders, midfielders, and attackers). The six attributes goalkeepers are graded on are diving, handling, kicking, reflexes, positioning, and speed. The six attributes ground players are graded on are pace, shooting, passing, dribbling, defending, and physicality. The purpose of this project is to answer the question “What factors are most important in predicting the price certain players will sell for on the Ultimate Team market?” We hypothesize that goalkeepers’ prices will be largely correlated with reflexes and diving, defenders’ prices with physicality and defending, midfielders’ prices with passing, and attackers’ prices with shooting. A machine learning model will then be run to find which factors gamers consider most when purchasing ground players and goalkeepers.
Others have attempted projects similar to this one, but they either used a smaller dataset of “big name” players or focused on only one position. In this project, we wish to find a model that accurately reflects the factors that gamers consider when purchasing these players. Our dataset was obtained from Kaggle and includes 17,614 different cards with all their attributes and individual traits, spread across 95 columns, of which more than 30 are quantitative variables we are interested in assessing.
We were inspired to create a project about this topic because we have been shocked by the prices that players go for on the market when new games are released. The dataset covers the period from the release of the Ultimate Team game mode to the release of the next FIFA. It includes the maximum, minimum, and last price of transactions for players in the marketplace across different consoles. We will be focusing primarily on the last price on the PS4 console.
The photo above displays a gold goalkeeper, silver defenders and midfielders, a bronze striker, and a silver striker.
Courtois is one of the highest-rated goalkeepers in the game. Notice that his six graded attributes are diving, handling, kicking, reflexes, speed, and positioning, which differ from those of ground players. We believe that goalkeepers’ prices will be most correlated with diving (84) and reflexes (88).
Frenchman Kylian Mbappé plays the striker position for PSG. Since strikers are attackers and thus ground players, their attributes are pace, shooting, passing, dribbling, defending, and physicality. We predict that the most important factor for strikers is shooting (86).
What factors are most important in predicting the price certain players will sell for on the ultimate team market?
str(players_read)
## 'data.frame': 17614 obs. of 95 variables:
## $ futbin_id : int 1 2 3 4 5 6 7 8 9 10 ...
## $ player_name : chr "Shearer" "Keane" "Giggs" "Scholes" ...
## $ player_extended_name: chr "Alan Shearer" "Roy Keane" "Ryan Giggs" "Paul Scholes" ...
## $ quality : chr "Gold - Rare" "Gold - Rare" "Gold - Rare" "Gold - Rare" ...
## $ revision : chr "Icon" "Icon" "Icon" "Icon" ...
## $ origin : chr "Prime" "Prime" "Prime" "Prime" ...
## $ overall : int 91 90 92 91 85 90 88 90 88 92 ...
## $ club : chr "FUT 21 ICONS" "FUT 21 ICONS" "FUT 21 ICONS" "FUT 21 ICONS" ...
## $ league : chr "Icons" "Icons" "Icons" "Icons" ...
## $ nationality : chr "England" "Republic of Ireland" "Wales" "England" ...
## $ position : chr "ST" "CM" "LM" "CM" ...
## $ age : int 50 49 46 45 46 47 47 48 47 47 ...
## $ date_of_birth : chr "1970-08-13" "1971-08-10" "1973-11-29" "1974-11-16" ...
## $ height : int 182 180 179 171 188 173 185 180 168 178 ...
## $ weight : int 78 76 71 71 89 70 82 74 70 74 ...
## $ intl_rep : int 5 4 4 4 3 3 4 4 4 4 ...
## $ added_date : chr "2020-09-10" "2020-09-10" "2020-09-10" "2020-09-10" ...
## $ pace : int 81 72 90 72 84 87 82 86 90 86 ...
## $ pace_acceleration : int 82 70 91 73 81 90 83 83 91 85 ...
## $ pace_sprint_speed : int 80 73 90 71 86 84 81 88 90 87 ...
## $ dribbling : int 78 81 91 80 56 94 80 88 79 85 ...
## $ drib_agility : int 71 70 82 68 56 93 59 86 79 79 ...
## $ drib_balance : int 71 77 82 89 56 89 66 83 85 85 ...
## $ drib_reactions : int 87 92 84 89 83 84 89 89 92 91 ...
## $ drib_ball_control : int 82 87 91 87 57 93 83 92 84 92 ...
## $ drib_dribbling : int 76 78 94 75 51 96 83 86 74 81 ...
## $ drib_composure : int 88 92 88 88 77 89 88 82 81 88 ...
## $ shooting : int 93 71 80 87 43 83 88 81 81 63 ...
## $ shoot_positioning : int 92 71 88 92 41 88 94 89 72 77 ...
## $ shoot_finishing : int 95 70 83 84 40 79 90 75 74 62 ...
## $ shoot_shot_power : int 94 65 74 91 60 87 89 87 94 75 ...
## $ shoot_long_shots : int 86 84 76 94 35 89 85 85 90 50 ...
## $ shoot_volleys : int 93 58 83 85 33 81 83 81 76 60 ...
## $ shoot_penalties : int 94 71 86 73 37 84 75 84 79 69 ...
## $ passing : int 77 85 90 91 56 84 70 88 86 88 ...
## $ pass_vision : int 76 87 86 94 52 86 70 88 80 84 ...
## $ pass_crossing : int 77 73 94 89 46 81 57 86 86 95 ...
## $ pass_free_kick : int 86 72 83 73 16 94 68 92 93 58 ...
## $ pass_short : int 82 92 91 93 73 83 78 91 88 94 ...
## $ pass_long : int 63 89 88 94 59 79 65 86 82 88 ...
## $ pass_curve : int 81 70 90 76 33 92 76 86 91 60 ...
## $ defending : int 52 87 44 64 85 40 41 44 83 90 ...
## $ def_interceptions : int 44 94 46 82 84 46 41 66 91 93 ...
## $ def_heading : int 94 77 57 83 87 61 90 50 74 85 ...
## $ def_marking : int 28 80 37 56 86 33 45 32 79 88 ...
## $ def_stand_tackle : int 65 94 43 60 83 39 26 40 84 91 ...
## $ def_slid_tackle : int 55 87 46 42 87 33 27 41 86 89 ...
## $ physicality : int 85 89 67 82 85 60 84 79 85 81 ...
## $ phys_jumping : int 88 79 61 85 84 54 80 80 83 50 ...
## $ phys_stamina : int 84 93 89 94 73 72 78 88 88 92 ...
## $ phys_strength : int 88 86 60 73 91 62 91 84 84 77 ...
## $ phys_aggression : int 80 96 57 88 84 41 76 57 83 87 ...
## $ gk_diving : int NA NA NA NA NA NA NA NA NA NA ...
## $ gk_reflexes : int NA NA NA NA NA NA NA NA NA NA ...
## $ gk_handling : int NA NA NA NA NA NA NA NA NA NA ...
## $ gk_speed : int NA NA NA NA NA NA NA NA NA NA ...
## $ gk_kicking : int NA NA NA NA NA NA NA NA NA NA ...
## $ gk_positoning : int NA NA NA NA NA NA NA NA NA NA ...
## $ pref_foot : chr "Right" "Right" "Left" "Right" ...
## $ att_workrate : chr "High" "Med" "High" "High" ...
## $ def_workrate : chr "Med" "High" "Med" "Med" ...
## $ weak_foot : int 3 3 2 3 3 4 4 4 2 4 ...
## $ skill_moves : int 2 2 3 3 2 5 3 4 3 2 ...
## $ cb : int 66 88 55 71 84 51 58 58 84 87 ...
## $ rb : int 68 86 68 74 78 62 58 68 87 91 ...
## $ lb : int 68 86 68 74 78 62 58 68 87 91 ...
## $ rwb : int 69 86 73 77 74 66 61 73 87 91 ...
## $ lwb : int 69 86 73 77 74 66 61 73 87 91 ...
## $ cdm : int 67 90 68 80 76 63 61 71 86 90 ...
## $ cm : int 77 88 84 89 62 81 74 85 84 87 ...
## $ rm : int 81 81 90 86 60 86 79 87 83 86 ...
## $ lm : int 81 81 90 86 60 86 79 87 83 86 ...
## $ cam : int 82 83 88 87 58 88 81 88 82 83 ...
## $ cf : int 85 80 87 86 58 87 85 86 82 81 ...
## $ rf : int 85 80 87 86 58 87 85 86 82 81 ...
## $ lf : int 85 80 87 86 58 87 85 86 82 81 ...
## $ rw : int 82 79 89 84 57 88 81 87 82 83 ...
## $ lw : int 82 79 89 84 57 88 81 87 82 83 ...
## $ st : int 89 77 82 85 61 83 88 82 81 78 ...
## $ traits : chr "Power Header, Long Shot Taker (CPU AI Only), Power Free-Kick" "" "Speed Dribbler (CPU AI Only)" "Team Player, Playmaker (CPU AI Only), Long Shot Taker (CPU AI Only), Long Passer (CPU AI Only), Dives Into Tack"| __truncated__ ...
## $ specialities : logi NA NA NA NA NA NA ...
## $ base_id : int 51 240 241 246 388 570 942 1025 1040 1041 ...
## $ resource_id : int 51 240 241 246 388 570 942 1025 1040 1041 ...
## $ ps4_last : int NA NA NA NA 568000 NA 335000 NA 717000 NA ...
## $ ps4_min : int NA NA NA NA 64000 NA 66000 NA 66000 NA ...
## $ ps4_max : int NA NA NA NA 1100000 NA 550000 NA 1200000 NA ...
## $ ps4_prp : int 0 0 0 0 48 0 55 0 57 0 ...
## $ xbox_last : int NA NA NA NA 553000 NA 339000 NA 750000 NA ...
## $ xbox_min : int NA NA NA NA 64000 NA 66000 NA 66000 NA ...
## $ xbox_max : int NA NA NA NA 850000 NA 600000 NA 1200000 NA ...
## $ xbox_prp : int 0 0 0 0 62 0 51 0 60 0 ...
## $ pc_last : int NA NA NA NA 669000 NA 479000 NA NA NA ...
## $ pc_min : int NA NA NA NA 70000 NA 66000 NA 105000 NA ...
## $ pc_max : int NA NA NA NA 1300000 NA 750000 NA 2000000 NA ...
## $ pc_prp : int 0 0 0 0 48 0 60 0 73 0 ...
This dataset has many variables that could represent a player’s price. The first thing we did was to analyze which of them was the most appropriate target variable.
summary(players_read[,84:95])
## ps4_last ps4_min ps4_max ps4_prp
## Min. : 200 Min. : 150 Min. : 10000 Min. : 0.000
## 1st Qu.: 300 1st Qu.: 150 1st Qu.: 10000 1st Qu.: 0.000
## Median : 500 Median : 150 Median : 10000 Median : 2.000
## Mean : 15145 Mean : 1710 Mean : 33657 Mean : 6.456
## 3rd Qu.: 850 3rd Qu.: 250 3rd Qu.: 10000 3rd Qu.: 5.000
## Max. :10500000 Max. :853000 Max. :15000000 Max. :100.000
## NA's :723 NA's :247 NA's :247
## xbox_last xbox_min xbox_max xbox_prp
## Min. : 200 Min. : -1 Min. : -1 Min. : 0.000
## 1st Qu.: 550 1st Qu.: 150 1st Qu.: 10000 1st Qu.: 0.000
## Median : 850 Median : 250 Median : 10000 Median : 0.000
## Mean : 70142 Mean : 1829 Mean : 35237 Mean : 2.821
## 3rd Qu.: 2900 3rd Qu.: 300 3rd Qu.: 10000 3rd Qu.: 0.000
## Max. :11499000 Max. :801000 Max. :15000000 Max. :100.000
## NA's :14195 NA's :1692 NA's :1692
## pc_last pc_min pc_max pc_prp
## Min. : 200 Min. : -1 Min. : -1 Min. : 0.000
## 1st Qu.: 700 1st Qu.: 150 1st Qu.: 10000 1st Qu.: 0.000
## Median : 950 Median : 250 Median : 10000 Median : 0.000
## Mean : 94104 Mean : 1968 Mean : 39044 Mean : 3.009
## 3rd Qu.: 3700 3rd Qu.: 300 3rd Qu.: 10000 3rd Qu.: 0.000
## Max. :11880000 Max. :1086000 Max. :15000000 Max. :100.000
## NA's :14378 NA's :250 NA's :250
target_nas
As the bar plot shows, PS4 is the only viable choice: ps4_last has 723 missing values, while xbox_last and pc_last are missing for 14,195 and 14,378 cards respectively. Since ps4_max is the buy-now price and ps4_min is the minimum first bid, they are not appropriate for our investigation. ps4_last is what players actually sold for in the market, so we chose it as our target variable.
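The missing-value comparison behind this choice can be sketched in a few lines of base R. This is only an illustration: the `prices` data frame is a small hypothetical stand-in for the three "last price" columns of `players_read`, not the real data.

```r
# Hypothetical stand-in for the "last price" columns of players_read.
prices <- data.frame(
  ps4_last  = c(568000, NA, 335000, 717000, 500),
  xbox_last = c(553000, NA, NA, NA, NA),
  pc_last   = c(669000, NA, 479000, NA, NA)
)

# Count missing values in each candidate target column.
na_counts <- colSums(is.na(prices))

# The candidate with the fewest missing values is the most usable target.
best_target <- names(which.min(na_counts))

print(na_counts)
print(best_target)
```

On the real data the same `colSums(is.na(...))` call could be applied to the price columns directly.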
players_price_points
## No trace type specified:
## Based on info supplied, a 'scatter' trace seems appropriate.
## Read more about this trace type -> https://plotly.com/r/reference/#scatter
## No scatter mode specifed:
## Setting the mode to markers
## Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
## Warning: Ignoring 623 observations
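Next, we observed that the price distribution was roughly exponential: most cards are cheap and a small number are very expensive. This made sense to us, since almost no one uses the worst players. To understand the expensive end of the distribution, we decided to look at how the Icons league compares with the others.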
league_stats
We compared leagues and plotted averages for the four most expensive leagues in FIFA. Observe how Icons is superior in almost every attribute and absolutely dwarfs every other league’s normalized average price.
Initially we used a random forest algorithm to predict players’ prices in the FUT marketplace. After observing the density plot of prices, we concluded that our results could be skewed by outliers.
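As a rough illustration of how a random forest yields the `%IncMSE` and `IncNodePurity` importance tables shown later, here is a minimal sketch using the `randomForest` package. The toy data, the column `noise`, the coefficients, and the `ntree` value are all assumptions made for the example; they are not the settings used in this project.

```r
library(randomForest)  # assumed to be installed

set.seed(7)
n <- 300

# Hypothetical toy data: price depends on two ratings, not on `noise`.
toy <- data.frame(
  overall   = runif(n, 60, 95),
  gk_diving = runif(n, 50, 95),
  noise     = runif(n, 0, 1)
)
toy$price <- 0.01 * toy$overall + 0.005 * toy$gk_diving + rnorm(n, sd = 0.02)

# importance = TRUE is needed to get the permutation-based %IncMSE column.
rf <- randomForest(price ~ ., data = toy, ntree = 200, importance = TRUE)

# Sort predictors from most to least important by %IncMSE.
imp <- importance(rf)
imp <- imp[order(imp[, "%IncMSE"], decreasing = TRUE), ]
print(imp)
```

`%IncMSE` measures how much the out-of-bag error grows when a predictor is randomly permuted, while `IncNodePurity` sums the reduction in residual sum of squares from splits on that predictor.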
grid.arrange(keeper_elbow,defender_elbow,midfielder_elbow,attacker_elbow)
After using k-means to cluster our data by price, it did seem that there were two very distinct groups in each position.
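The elbow plots and the two-cluster fit can be sketched with base R's `kmeans`. The simulated `price` vector below is a hypothetical stand-in for the normalized prices of one position, and the range of `k` and the `nstart` value are assumptions for the example.

```r
set.seed(42)  # k-means starts from random centers, so fix the seed

# Hypothetical normalized prices: many cheap cards, a few very expensive ones.
price <- c(runif(200, min = 0, max = 0.05), runif(5, min = 0.8, max = 1))

# Total within-cluster sum of squares for k = 1..6 (the "elbow" curve).
wss <- sapply(1:6, function(k) {
  kmeans(price, centers = k, nstart = 10)$tot.withinss
})

# Fit the two-cluster solution and count the cards in each cluster.
fit <- kmeans(price, centers = 2, nstart = 10)
cluster_sizes <- table(fit$cluster)

print(round(wss, 4))
print(cluster_sizes)
```

A sharp drop in `wss` from k = 1 to k = 2 followed by a flat curve is what suggests two groups.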
cluster_vis(keeper_cluster)
cluster_vis(defender_cluster)
cluster_vis(midfielder_cluster)
cluster_vis(attacker_cluster)
After running the algorithm, our suspicions were proved right: in every position, one cluster contains only a handful of cards.
knitr::kable(keeper_outliers, caption="keeper")
| clusterNum | count |
|---|---|
| 1 | 15 |
| 2 | 1760 |
knitr::kable(defenders_outliers, caption="defender")
| clusterNum | count |
|---|---|
| 1 | 12 |
| 2 | 5450 |
knitr::kable(midfielders_outliers, caption="midfielder")
| clusterNum | count |
|---|---|
| 1 | 23 |
| 2 | 6245 |
knitr::kable(attackers_outliers, caption="attacker")
| clusterNum | count |
|---|---|
| 1 | 3173 |
| 2 | 12 |
So we removed the outliers from each position’s data and ran the random forests again.
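One way to drop the outlier cluster is sketched below, assuming the outliers are simply the smaller of the two price clusters. The `cards` data frame and its column names are hypothetical.

```r
set.seed(1)

# Hypothetical cards for one position: 100 ordinary cards and 5 very expensive ones.
cards <- data.frame(
  player = paste0("card_", 1:105),
  price  = c(runif(100, min = 0, max = 0.05), runif(5, min = 0.8, max = 1))
)

fit <- kmeans(cards$price, centers = 2, nstart = 10)

# Cluster labels are arbitrary, so identify the main cluster by its size
# rather than by its number.
sizes <- table(fit$cluster)
main_cluster <- as.integer(names(which.max(sizes)))

# Keep only the cards in the large cluster.
cards_clean <- cards[fit$cluster == main_cluster, ]
print(nrow(cards_clean))
```

Selecting by size matters because the label of the small cluster is not fixed: in the tables above it is cluster 1 for keepers, defenders, and midfielders but cluster 2 for attackers.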
results <- data.frame(model=c("before","after"), keeper = c(b_keepers_rmse, keepers_rmse), defender = c(b_defenders_rmse,defenders_rmse),
midfielders = c(b_midfielders_rmse, midfielders_rmse), attackers = c(b_attackers_rmse, attackers_rmse))
knitr::kable(results, caption="Results")
| model | keeper | defender | midfielders | attackers |
|---|---|---|---|---|
| before | 0.0263155 | 0.0043878 | 0.0174412 | 0.0192237 |
| after | 0.0115551 | 0.0030319 | 0.0030592 | 0.0085265 |
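For reference, RMSE values of this size are consistent with prices having been rescaled to the range 0 to 1 before modelling. The sketch below shows that calculation under the assumption of min-max scaling; the `normalize` helper, the example prices, and the predictions are hypothetical.

```r
# Min-max scaling to [0, 1]; an assumption about how prices were normalized.
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# Root mean squared error between observed and predicted values.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

actual    <- normalize(c(200, 500, 850, 4500, 10000))
predicted <- c(0.00, 0.05, 0.05, 0.40, 0.95)  # hypothetical model output

err <- rmse(actual, predicted)
print(err)
```

Because the error is measured on the normalized scale, an RMSE of 0.01 corresponds to about 1% of the gap between the cheapest and most expensive card in that group.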
b_keepers_importance
## %IncMSE IncNodePurity
## overall 0.000328 2.997712e-02
## drib_reactions 0.000289 3.949737e-02
## gk_handling 0.000254 3.046635e-02
## gk_diving 0.000244 2.491395e-02
## gk_positoning 0.000241 3.251602e-02
## gk_reflexes 0.000222 2.602549e-02
## drib_composure 0.000104 1.325417e-02
## drib_ball_control 0.000060 3.890848e-03
## shoot_shot_power 0.000057 1.117904e-02
## drib_agility 0.000056 2.428095e-03
## gk_kicking 0.000046 1.645373e-02
## intl_rep 0.000042 6.378351e-03
## pass_short 0.000025 8.990587e-03
## pass_vision 0.000020 5.270629e-03
## pace_sprint_speed 0.000016 5.027245e-03
## pass_long 0.000016 7.179887e-03
## height 0.000012 5.436466e-03
## phys_aggression 0.000011 2.559491e-03
## weight 0.000010 4.627273e-03
## gk_speed 0.000009 2.424587e-03
## revision 0.000008 3.624819e-03
## drib_dribbling 0.000006 2.616775e-03
## pass_curve 0.000005 1.817748e-04
## pace_acceleration 0.000004 1.980195e-03
## league 0.000003 9.214477e-04
## def_interceptions 0.000003 5.827528e-04
## quality 0.000002 6.297436e-05
## nationality 0.000002 1.922560e-04
## shoot_positioning 0.000002 1.446141e-03
## shoot_long_shots 0.000002 2.060462e-04
## shoot_finishing 0.000001 2.146898e-04
## shoot_penalties 0.000001 2.208135e-04
## pass_crossing 0.000001 1.741088e-04
## pass_free_kick 0.000001 4.674825e-04
## def_slid_tackle 0.000001 1.061267e-04
## phys_stamina 0.000001 2.242717e-04
## pref_foot 0.000001 6.246121e-04
## weak_foot 0.000001 1.032293e-04
## position 0.000000 0.000000e+00
## age 0.000000 1.722689e-04
## drib_balance 0.000000 1.012215e-04
## shoot_volleys 0.000000 2.661836e-03
## def_heading 0.000000 2.362376e-04
## def_marking 0.000000 5.216334e-04
## def_stand_tackle 0.000000 1.228142e-04
## att_workrate 0.000000 0.000000e+00
## def_workrate 0.000000 0.000000e+00
## club -0.000003 7.749004e-04
## phys_strength -0.000003 1.706323e-03
## phys_jumping -0.000007 4.507242e-04
keepers_importance
## %IncMSE IncNodePurity
## overall 3.3e-05 3.060551e-03
## gk_diving 2.6e-05 2.547787e-03
## gk_reflexes 2.5e-05 2.703097e-03
## gk_positoning 2.2e-05 2.685162e-03
## gk_handling 1.9e-05 1.749037e-03
## drib_reactions 1.7e-05 2.201472e-03
## revision 8.0e-06 1.028368e-03
## shoot_shot_power 8.0e-06 8.523003e-04
## gk_kicking 6.0e-06 7.411197e-04
## intl_rep 5.0e-06 7.135651e-04
## pace_acceleration 5.0e-06 3.448740e-04
## drib_composure 4.0e-06 7.280428e-04
## pace_sprint_speed 3.0e-06 5.703521e-04
## phys_jumping 3.0e-06 6.303861e-04
## gk_speed 3.0e-06 5.195172e-04
## drib_dribbling 2.0e-06 2.016260e-04
## def_heading 2.0e-06 1.732035e-04
## phys_stamina 2.0e-06 4.413155e-04
## quality 1.0e-06 8.966492e-05
## drib_agility 1.0e-06 3.288748e-04
## drib_ball_control 1.0e-06 2.191948e-04
## shoot_positioning 1.0e-06 1.798616e-04
## shoot_finishing 1.0e-06 2.088474e-04
## shoot_long_shots 1.0e-06 1.960946e-04
## shoot_volleys 1.0e-06 1.728552e-04
## def_interceptions 1.0e-06 5.485462e-04
## def_marking 1.0e-06 3.045702e-04
## player_extended_name 0.0e+00 4.454987e-04
## club 0.0e+00 1.185921e-04
## league 0.0e+00 2.325397e-04
## position 0.0e+00 0.000000e+00
## age 0.0e+00 1.557558e-04
## height 0.0e+00 2.138559e-04
## weight 0.0e+00 1.213210e-04
## drib_balance 0.0e+00 1.706189e-04
## shoot_penalties 0.0e+00 2.446450e-04
## pass_vision 0.0e+00 1.851818e-04
## pass_crossing 0.0e+00 8.788065e-05
## pass_free_kick 0.0e+00 1.387993e-04
## pass_short 0.0e+00 4.234920e-04
## pass_long 0.0e+00 4.337346e-04
## pass_curve 0.0e+00 2.862844e-04
## def_stand_tackle 0.0e+00 1.359132e-04
## def_slid_tackle 0.0e+00 6.349476e-05
## phys_strength 0.0e+00 1.511344e-04
## phys_aggression 0.0e+00 2.485939e-04
## pref_foot 0.0e+00 1.192546e-05
## att_workrate 0.0e+00 0.000000e+00
## def_workrate 0.0e+00 0.000000e+00
## weak_foot 0.0e+00 6.692633e-05
## nationality -1.0e-06 3.756909e-04
b_defenders_importance
## %IncMSE IncNodePurity
## lb 5e-06 8.056621e-04
## lwb 5e-06 9.929300e-04
## overall 4e-06 8.751707e-04
## pace 3e-06 3.567339e-04
## drib_agility 3e-06 7.182004e-05
## drib_reactions 3e-06 8.986059e-04
## lw 3e-06 2.198412e-04
## drib_composure 2e-06 9.194133e-04
## shoot_shot_power 2e-06 3.519381e-04
## pass_short 2e-06 7.548124e-04
## pass_long 2e-06 7.242510e-04
## def_stand_tackle 2e-06 6.151853e-04
## def_slid_tackle 2e-06 5.754300e-04
## cb 2e-06 4.998688e-04
## cdm 2e-06 5.831263e-04
## cm 2e-06 1.434728e-05
## defending 1e-06 1.104674e-04
## def_interceptions 1e-06 1.199720e-04
## def_heading 1e-06 5.458287e-04
## def_marking 1e-06 4.564051e-04
## physicality 1e-06 4.830134e-04
## rb 1e-06 3.768382e-04
## rwb 1e-06 8.156604e-05
## quality 0e+00 2.624444e-07
## revision 0e+00 1.116554e-04
## club 0e+00 1.644796e-06
## league 0e+00 6.951327e-07
## nationality 0e+00 1.428723e-05
## position 0e+00 2.771469e-07
## age 0e+00 6.041754e-07
## height 0e+00 5.009746e-07
## weight 0e+00 6.123336e-06
## intl_rep 0e+00 5.915010e-05
## pace_acceleration 0e+00 2.603608e-06
## pace_sprint_speed 0e+00 3.335824e-04
## dribbling 0e+00 4.338221e-05
## drib_balance 0e+00 7.335902e-07
## drib_ball_control 0e+00 2.359345e-05
## drib_dribbling 0e+00 2.858181e-05
## shooting 0e+00 2.652259e-05
## shoot_positioning 0e+00 2.810185e-05
## shoot_finishing 0e+00 7.600470e-05
## shoot_long_shots 0e+00 2.274084e-05
## shoot_volleys 0e+00 4.185311e-05
## shoot_penalties 0e+00 2.143363e-05
## passing 0e+00 1.917262e-05
## pass_vision 0e+00 1.177394e-05
## pass_crossing 0e+00 2.728887e-04
## pass_free_kick 0e+00 8.442069e-05
## pass_curve 0e+00 1.098820e-06
## phys_jumping 0e+00 9.142454e-05
## phys_stamina 0e+00 5.307352e-06
## phys_strength 0e+00 3.329438e-05
## phys_aggression 0e+00 9.024337e-06
## pref_foot 0e+00 8.056876e-08
## att_workrate 0e+00 3.180842e-07
## def_workrate 0e+00 2.668906e-06
## weak_foot 0e+00 3.046671e-05
## rm 0e+00 9.558760e-05
## lm 0e+00 5.329909e-06
## cam 0e+00 7.456003e-05
## cf 0e+00 3.830783e-05
## rf 0e+00 1.052651e-04
## lf 0e+00 7.464323e-05
## rw 0e+00 2.752778e-05
## st 0e+00 5.213014e-05
defenders_importance
## %IncMSE IncNodePurity
## overall 1e-06 9.657280e-05
## drib_ball_control 1e-06 1.619949e-05
## shoot_shot_power 1e-06 2.689455e-05
## lwb 1e-06 8.863313e-05
## cdm 1e-06 1.100938e-04
## rf 1e-06 2.532354e-05
## rw 1e-06 4.946883e-05
## player_extended_name 0e+00 5.708745e-07
## quality 0e+00 1.758870e-07
## revision 0e+00 5.070947e-05
## club 0e+00 1.177651e-06
## league 0e+00 5.418509e-07
## nationality 0e+00 7.450686e-07
## position 0e+00 7.710034e-07
## age 0e+00 4.302567e-06
## height 0e+00 5.875996e-07
## weight 0e+00 9.067035e-07
## intl_rep 0e+00 5.038992e-05
## pace 0e+00 2.285685e-06
## pace_acceleration 0e+00 5.777537e-05
## pace_sprint_speed 0e+00 4.690302e-05
## dribbling 0e+00 2.577121e-05
## drib_agility 0e+00 2.384084e-05
## drib_balance 0e+00 2.376684e-06
## drib_reactions 0e+00 4.594782e-05
## drib_dribbling 0e+00 2.786736e-05
## drib_composure 0e+00 8.625670e-06
## shooting 0e+00 5.005860e-06
## shoot_positioning 0e+00 3.672379e-05
## shoot_finishing 0e+00 2.434398e-05
## shoot_long_shots 0e+00 1.721042e-06
## shoot_volleys 0e+00 6.988004e-07
## shoot_penalties 0e+00 1.703097e-05
## passing 0e+00 3.161905e-06
## pass_vision 0e+00 4.175994e-05
## pass_crossing 0e+00 1.223611e-05
## pass_free_kick 0e+00 3.207162e-06
## pass_short 0e+00 5.076419e-05
## pass_long 0e+00 2.013014e-05
## pass_curve 0e+00 2.042452e-05
## defending 0e+00 7.643355e-05
## def_interceptions 0e+00 2.492318e-05
## def_heading 0e+00 1.247984e-05
## def_marking 0e+00 3.930964e-05
## def_stand_tackle 0e+00 8.723314e-05
## def_slid_tackle 0e+00 4.501706e-05
## physicality 0e+00 6.971734e-06
## phys_jumping 0e+00 1.752881e-05
## phys_stamina 0e+00 1.654016e-05
## phys_strength 0e+00 1.879251e-06
## phys_aggression 0e+00 6.493914e-06
## pref_foot 0e+00 1.835266e-08
## att_workrate 0e+00 8.589677e-08
## def_workrate 0e+00 3.005087e-08
## weak_foot 0e+00 4.566895e-06
## cb 0e+00 1.932377e-05
## rb 0e+00 4.867388e-05
## lb 0e+00 7.411448e-05
## rwb 0e+00 4.534939e-05
## cm 0e+00 3.824120e-05
## rm 0e+00 4.692690e-05
## lm 0e+00 3.715177e-05
## cam 0e+00 2.973403e-05
## cf 0e+00 2.099211e-05
## lf 0e+00 3.378846e-05
## lw 0e+00 1.756334e-05
## st 0e+00 5.174483e-05
b_midfielders_importance
## %IncMSE IncNodePurity
## overall 4e-06 1.044370e-03
## lf 4e-06 8.256541e-04
## rm 3e-06 5.709960e-04
## cf 3e-06 6.453340e-04
## drib_composure 2e-06 4.932833e-04
## shooting 2e-06 2.409442e-04
## shoot_positioning 2e-06 4.784694e-04
## lm 2e-06 2.377206e-04
## cam 2e-06 3.784593e-04
## lw 2e-06 1.984750e-04
## st 2e-06 2.644642e-04
## dribbling 1e-06 2.539899e-04
## drib_reactions 1e-06 2.367001e-04
## drib_dribbling 1e-06 2.794114e-04
## shoot_shot_power 1e-06 2.291863e-04
## shoot_volleys 1e-06 4.360402e-04
## shoot_penalties 1e-06 6.827525e-04
## pass_vision 1e-06 1.510492e-04
## pass_short 1e-06 1.531192e-04
## pass_long 1e-06 3.238956e-04
## def_marking 1e-06 5.013313e-04
## def_slid_tackle 1e-06 4.898701e-04
## cb 1e-06 1.770434e-04
## rb 1e-06 3.361529e-04
## lb 1e-06 1.547965e-04
## rwb 1e-06 2.460618e-04
## lwb 1e-06 2.638249e-04
## cdm 1e-06 6.989374e-05
## cm 1e-06 9.619646e-05
## rf 1e-06 5.911512e-04
## rw 1e-06 2.577196e-04
## quality 0e+00 2.798167e-07
## revision 0e+00 4.372818e-05
## club 0e+00 4.986279e-06
## league 0e+00 5.494798e-07
## nationality 0e+00 7.236325e-07
## position 0e+00 2.119051e-07
## age 0e+00 5.726065e-07
## height 0e+00 4.292610e-07
## weight 0e+00 1.587870e-05
## intl_rep 0e+00 4.308695e-05
## pace 0e+00 5.024281e-05
## pace_acceleration 0e+00 1.426719e-04
## pace_sprint_speed 0e+00 1.902391e-04
## drib_agility 0e+00 1.921258e-04
## drib_balance 0e+00 2.176832e-06
## drib_ball_control 0e+00 2.266651e-05
## shoot_finishing 0e+00 2.819130e-05
## shoot_long_shots 0e+00 1.044664e-04
## passing 0e+00 6.451621e-05
## pass_crossing 0e+00 6.896587e-05
## pass_free_kick 0e+00 1.430230e-04
## pass_curve 0e+00 2.393191e-04
## defending 0e+00 4.844405e-05
## def_interceptions 0e+00 2.117295e-04
## def_heading 0e+00 9.919471e-05
## def_stand_tackle 0e+00 1.238809e-04
## physicality 0e+00 1.014808e-04
## phys_jumping 0e+00 1.586738e-04
## phys_stamina 0e+00 2.333841e-04
## phys_strength 0e+00 1.369067e-04
## phys_aggression 0e+00 1.287430e-04
## pref_foot 0e+00 8.241973e-08
## att_workrate 0e+00 1.298129e-07
## def_workrate 0e+00 1.257182e-07
## weak_foot 0e+00 1.105315e-07
midfielders_importance
## %IncMSE IncNodePurity
## lm 1e-06 6.472140e-05
## cam 1e-06 9.194850e-05
## cf 1e-06 7.948094e-05
## rw 1e-06 7.617333e-05
## player_extended_name 0e+00 1.186416e-06
## quality 0e+00 4.819768e-07
## revision 0e+00 6.232980e-07
## overall 0e+00 3.690083e-05
## club 0e+00 4.961494e-07
## league 0e+00 4.863905e-07
## nationality 0e+00 9.449357e-07
## position 0e+00 2.003312e-07
## age 0e+00 2.409901e-06
## height 0e+00 1.805600e-06
## weight 0e+00 4.204567e-07
## intl_rep 0e+00 6.030560e-06
## pace 0e+00 8.862275e-06
## pace_acceleration 0e+00 1.422897e-05
## pace_sprint_speed 0e+00 8.647669e-06
## dribbling 0e+00 4.725044e-05
## drib_agility 0e+00 1.769071e-05
## drib_balance 0e+00 2.152299e-06
## drib_reactions 0e+00 4.172448e-05
## drib_ball_control 0e+00 2.572026e-05
## drib_dribbling 0e+00 2.221821e-05
## drib_composure 0e+00 2.213267e-05
## shooting 0e+00 3.043551e-05
## shoot_positioning 0e+00 1.827975e-05
## shoot_finishing 0e+00 2.836150e-05
## shoot_shot_power 0e+00 1.834727e-05
## shoot_long_shots 0e+00 3.534368e-06
## shoot_volleys 0e+00 1.763327e-05
## shoot_penalties 0e+00 1.041331e-05
## passing 0e+00 3.462530e-05
## pass_vision 0e+00 1.658021e-05
## pass_crossing 0e+00 2.460353e-05
## pass_free_kick 0e+00 6.194717e-07
## pass_short 0e+00 6.481123e-06
## pass_long 0e+00 6.369939e-06
## pass_curve 0e+00 1.140177e-05
## defending 0e+00 2.796541e-05
## def_interceptions 0e+00 3.018195e-05
## def_heading 0e+00 4.053867e-06
## def_marking 0e+00 3.083723e-05
## def_stand_tackle 0e+00 4.189796e-05
## def_slid_tackle 0e+00 9.146947e-07
## physicality 0e+00 8.741310e-06
## phys_jumping 0e+00 1.536880e-05
## phys_stamina 0e+00 3.702801e-05
## phys_strength 0e+00 4.726704e-07
## phys_aggression 0e+00 1.028418e-05
## pref_foot 0e+00 7.378189e-08
## att_workrate 0e+00 1.384827e-08
## def_workrate 0e+00 5.471447e-08
## weak_foot 0e+00 1.421496e-07
## cb 0e+00 4.249584e-05
## rb 0e+00 2.187713e-05
## lb 0e+00 4.670991e-06
## rwb 0e+00 1.756947e-05
## lwb 0e+00 1.648754e-05
## cdm 0e+00 4.498820e-05
## cm 0e+00 5.448419e-05
## rm 0e+00 5.060782e-05
## rf 0e+00 7.229695e-05
## lf 0e+00 7.306347e-05
## lw 0e+00 3.118357e-05
## st 0e+00 3.059317e-05
b_attackers_importance
## %IncMSE IncNodePurity
## cf 0.000116 1.573098e-02
## rf 0.000099 1.064767e-02
## rm 0.000083 1.647672e-02
## rw 0.000081 1.147529e-02
## dribbling 0.000068 9.304218e-03
## st 0.000068 9.873736e-03
## drib_reactions 0.000067 5.631062e-03
## overall 0.000066 6.457663e-03
## lw 0.000061 9.081608e-03
## lm 0.000060 7.081716e-03
## cam 0.000059 9.832530e-03
## lf 0.000055 1.102701e-02
## drib_ball_control 0.000050 1.367764e-02
## shooting 0.000042 6.993593e-03
## drib_composure 0.000039 5.410223e-03
## shoot_finishing 0.000039 6.045549e-03
## shoot_positioning 0.000037 4.019045e-03
## shoot_volleys 0.000025 9.484268e-04
## drib_dribbling 0.000021 7.271862e-03
## pace_sprint_speed 0.000016 8.025471e-03
## pass_short 0.000015 2.326702e-03
## pace 0.000013 6.001868e-03
## passing 0.000013 3.329707e-03
## pass_vision 0.000012 2.322818e-03
## rb 0.000011 2.346824e-04
## age 0.000010 1.450345e-03
## shoot_shot_power 0.000010 4.234831e-03
## shoot_long_shots 0.000009 3.045617e-03
## cdm 0.000009 7.254409e-04
## cm 0.000009 1.270487e-03
## intl_rep 0.000006 7.192945e-04
## pace_acceleration 0.000006 6.415201e-04
## pass_free_kick 0.000006 2.288542e-03
## pass_long 0.000006 2.182713e-03
## pass_curve 0.000005 3.972793e-04
## drib_agility 0.000004 2.998091e-03
## pass_crossing 0.000004 1.741411e-03
## drib_balance 0.000003 1.485545e-03
## def_heading 0.000002 3.819282e-03
## phys_stamina 0.000002 5.938079e-04
## lwb 0.000002 3.909695e-04
## revision 0.000001 1.298154e-04
## shoot_penalties 0.000001 1.141111e-03
## physicality 0.000001 2.032928e-03
## weak_foot 0.000001 9.784076e-04
## rwb 0.000001 9.672788e-04
## quality 0.000000 1.883215e-07
## club 0.000000 1.229483e-06
## league 0.000000 6.940603e-07
## nationality 0.000000 2.773274e-05
## position 0.000000 2.063730e-07
## height 0.000000 3.906158e-05
## defending 0.000000 2.553051e-05
## def_interceptions 0.000000 1.270407e-06
## def_stand_tackle 0.000000 8.615075e-07
## def_slid_tackle 0.000000 7.958917e-05
## phys_jumping 0.000000 7.451826e-03
## phys_strength 0.000000 5.812319e-05
## phys_aggression 0.000000 2.158711e-05
## pref_foot 0.000000 1.488193e-07
## att_workrate 0.000000 2.132844e-07
## def_workrate 0.000000 3.893696e-07
## cb 0.000000 1.350822e-04
## lb 0.000000 1.320112e-04
## weight -0.000001 7.127787e-05
## def_marking -0.000001 7.234383e-06
attackers_importance
## %IncMSE IncNodePurity
## cf 1.7e-05 1.898428e-03
## lf 1.7e-05 1.528982e-03
## rf 1.6e-05 1.381430e-03
## lw 1.5e-05 2.036822e-03
## st 1.4e-05 1.545659e-03
## rw 1.3e-05 1.390303e-03
## overall 1.0e-05 1.291125e-03
## shoot_positioning 1.0e-05 1.016777e-03
## cam 1.0e-05 1.094001e-03
## dribbling 9.0e-06 8.900513e-04
## shoot_finishing 8.0e-06 1.331533e-03
## drib_reactions 7.0e-06 6.917522e-04
## shooting 7.0e-06 7.390452e-04
## rm 7.0e-06 5.924733e-04
## drib_ball_control 6.0e-06 1.897360e-04
## drib_composure 5.0e-06 7.273428e-04
## lm 5.0e-06 8.332791e-04
## shoot_shot_power 4.0e-06 4.710711e-04
## drib_dribbling 3.0e-06 4.808724e-04
## shoot_penalties 3.0e-06 5.759000e-04
## cm 3.0e-06 5.026005e-04
## intl_rep 2.0e-06 5.235832e-04
## shoot_long_shots 2.0e-06 3.659654e-04
## shoot_volleys 2.0e-06 6.441853e-04
## def_marking 2.0e-06 7.417611e-05
## lb 2.0e-06 6.711235e-04
## revision 1.0e-06 1.261243e-04
## pace_acceleration 1.0e-06 5.150308e-04
## passing 1.0e-06 5.353912e-05
## pass_vision 1.0e-06 5.765672e-05
## pass_free_kick 1.0e-06 2.409543e-04
## pass_short 1.0e-06 9.700336e-05
## pass_long 1.0e-06 9.924002e-05
## pass_curve 1.0e-06 3.034369e-04
## rb 1.0e-06 7.709111e-05
## rwb 1.0e-06 3.676717e-04
## lwb 1.0e-06 2.132119e-04
## cdm 1.0e-06 2.426648e-04
## player_extended_name 0.0e+00 1.921599e-04
## quality 0.0e+00 2.537094e-07
## club 0.0e+00 6.188927e-05
## league 0.0e+00 7.947572e-07
## nationality 0.0e+00 2.400536e-06
## position 0.0e+00 1.704967e-07
## age 0.0e+00 2.136712e-06
## height 0.0e+00 3.217565e-05
## weight 0.0e+00 9.365505e-06
## pace 0.0e+00 5.916621e-05
## pace_sprint_speed 0.0e+00 1.932897e-04
## drib_agility 0.0e+00 1.064708e-04
## drib_balance 0.0e+00 3.905402e-05
## pass_crossing 0.0e+00 8.384388e-05
## defending 0.0e+00 1.499963e-05
## def_interceptions 0.0e+00 1.456562e-04
## def_heading 0.0e+00 3.601474e-04
## def_stand_tackle 0.0e+00 1.152901e-05
## def_slid_tackle 0.0e+00 7.553957e-07
## physicality 0.0e+00 9.393297e-05
## phys_jumping 0.0e+00 2.654808e-05
## phys_stamina 0.0e+00 4.839761e-05
## phys_strength 0.0e+00 7.739608e-05
## phys_aggression 0.0e+00 1.346716e-05
## pref_foot 0.0e+00 6.445403e-08
## att_workrate 0.0e+00 2.852868e-07
## def_workrate 0.0e+00 1.135587e-07
## weak_foot 0.0e+00 2.282594e-06
## cb 0.0e+00 1.557699e-04
After examining the models, we found that removing the outliers lowered the RMSE for every position. The drop was large for goalkeepers (0.0263 to 0.0116), midfielders (0.0174 to 0.0031), and attackers (0.0192 to 0.0085), and smaller for defenders (0.0044 to 0.0030). The three most important factors for goalkeepers before removing outliers were overall rating, dribbling reactions, and handling. After removing outliers, the most important factors were overall rating, diving, and reflexes; diving and reflexes ranked highly as we had hypothesized, but overall rating, which we had not anticipated, ranked first. For defenders, before removing outliers the most important factors were the left back (lb) position rating, the left wing-back (lwb) position rating, and the overall rating; after removing outliers they were overall rating, dribbling ball control, and shot power, although several defender predictors tie at the same rounded importance, so this ordering is weak. In conclusion, many of the factors that determine the price of players in the Ultimate Team marketplace were surprising to us. Other machine learning methods might have given different results and hence led to a different conclusion about the factors affecting our target variable.
We were able to answer our question of interest, but we wish we could have developed a more sophisticated model to describe the factors driving prices for players in FIFA Ultimate Team. Since we had far more price data for PS4 than for Xbox and PC, we could have looked for a dataset with complete price information for all players on Xbox as well as PS4, if one exists. We are also currently assuming that every player has the same number of cards in circulation, which isn’t necessarily true; it could be interesting to look for a dataset that has that information. We could also have excluded the position-specific overall ratings as predictors, which might have changed our results. While working through this project, we found that it would be interesting to determine how much of a factor the type of console is in the price, given the price discrepancies across platforms. To our surprise, the clustering results differed considerably across positions, and we would have liked to study that in more depth. Since the density of player prices resembles a gamma or exponential distribution, it would have been interesting to see what a gamma or exponential regression would look like. Lastly, fitting a multiple linear regression model would have been useful to see how much, on average, a one-point increase in a given attribute changes the price.
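As a starting point for that future work, a gamma regression can be fitted with base R's `glm`. The sketch below uses simulated data, so the variable names, the log link, and the data-generating process are all assumptions rather than anything fitted in this project.

```r
set.seed(3)
n <- 500

# Hypothetical data: mean price grows exponentially with overall rating.
overall <- runif(n, min = 60, max = 95)
mu      <- exp(-2 + 0.1 * overall)
price   <- rgamma(n, shape = 2, rate = 2 / mu)  # gamma draws with mean mu

# Gamma regression with a log link: log(E[price]) = b0 + b1 * overall.
fit <- glm(price ~ overall, family = Gamma(link = "log"))

print(coef(fit))

# exp(b1) is the multiplicative change in expected price per rating point.
print(exp(coef(fit)[["overall"]]))
```

With a log link, each coefficient reads as a percentage change in expected price per one-point increase in the attribute, which fits a right-skewed target better than an additive linear model.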
#Sources